“Transformation of the symbolic into the geometric” [McCormick et al. 1987]
“… finding the artificial memory that best supports our natural means of perception.” [Bertin 1967]
“The use of computer-generated, interactive, visual representations of data to amplify cognition.” [Card, Mackinlay, & Shneiderman 1999]
| g | mean_x | mean_y | sd_x | sd_y |
|---|---|---|---|---|
| 1 | 9 | 7.500909 | 3.316625 | 2.031568 |
| 2 | 9 | 7.500909 | 3.316625 | 2.031657 |
| 3 | 9 | 7.500000 | 3.316625 | 2.030424 |
| 4 | 9 | 7.500909 | 3.316625 | 2.030578 |
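The figures in this table match the summary statistics of R's built-in `anscombe` data frame. Assuming that is where they come from, the sketch below recomputes them in base R; the name `stats` is introduced here for illustration only.

```r
# Assumption: the table summarises R's built-in `anscombe` data
# (columns x1..x4 and y1..y4, one x/y pair per group).
stats <- do.call(rbind, lapply(1:4, function(g) {
  x <- anscombe[[paste0("x", g)]]
  y <- anscombe[[paste0("y", g)]]
  # one row of summary statistics per group
  data.frame(g = g,
             mean_x = mean(x), mean_y = mean(y),
             sd_x = sd(x), sd_y = sd(y))
}))
print(stats)
```

The four groups are nearly indistinguishable by these statistics, so the differences between them only show up once the points are plotted.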
graphical building blocks
The visual properties that vary
Not all channels are equal
The interesting part is not already available
1281768756138976546984506985604982826762 9809858458224509856458945098450980943585 9091030209905959595772564675050678904567 8845789809821677654876364908560912949686
manufacturer.
model.
displ. engine displacement, in litres
library(dplyr)
library(ggplot2)

# mean city mileage per manufacturer
m_cty = mpg %>% group_by(manufacturer) %>% summarize(mcty=mean(cty))
ggplot(data=m_cty)+
  geom_bar(aes(x=manufacturer,y=mcty),stat = 'identity')+
  scale_x_discrete("Manufacturer")+
  scale_y_continuous("Miles / Gallon (City conditions)")

# order manufacturers by decreasing mean mileage
m_cty_ordered = m_cty %>% arrange(desc(mcty)) %>%
  mutate(manufacturer=factor(manufacturer,levels=manufacturer))
ggplot(data=m_cty_ordered)+
  geom_bar(aes(x=manufacturer,y=mcty),stat = 'identity')+
  scale_x_discrete("Manufacturer")+
  scale_y_continuous("Miles / Gallon (City conditions)")

# same plot with horizontal bars
ggplot(data=m_cty_ordered)+
  geom_bar(aes(x=manufacturer,y=mcty),stat = 'identity')+
  scale_x_discrete("Manufacturer")+
  scale_y_continuous("Miles / Gallon (City conditions)")+
  coord_flip()

# read data and pre-processing
library(rjson)  # assumed: rjson's fromJSON() accepts a file argument
url = "./data/sp_Lyon.json"
data=fromJSON(file=url)
# keep the station id, the download time and the number of available bikes
extract = function(x){
  data.frame(id=x$'_id',
             time= x$download_date,
             nbbikes = x$available_bikes )
}
st_tempstats.df=do.call(rbind,lapply(data,extract))
# select a few stations (8 drawn at random)
sel = st_tempstats.df %>% select(id) %>% unique() %>% sample_n(8) %>% pull()
st_tempstats_sub.df = st_tempstats.df %>%
  filter(id %in% sel)

# one panel per station (with ggplot2 >= 3.4, use linewidth=2 instead of size=2)
ggplot(data=st_tempstats_sub.df)+
  geom_line(aes(x=time,y=nbbikes,group=id,color=factor(id)),size=2)+
  facet_grid(id ~ .)

mpg_su = mpg %>%
filter(class %in% c('compact','suv','pickup','minivan'))
ggplot(mpg_su)+geom_point(aes(x=cty,y=hwy,color=class))

mpg_su = mpg %>%
  filter(class %in% c('compact','suv','pickup','minivan'))
ggplot(mpg_su)+geom_point(aes(x=cty,y=hwy,shape=class))

\[\textrm{Lie factor} = \frac{\textrm{visual effect size}}{\textrm{data effect size}}\]
Knowing that the “apple” area (in green) is equal to \(2.22\,cm^2\) and that the rim area (in blue) is equal to \(2.96\,cm^2\), compute the Lie factor.
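The Lie factor definition above can be sketched as a small R helper. It assumes the usual reading of "effect size" as a relative change, (second value - first value) / first value. The function names are introduced here for illustration, and the data values passed at the end are hypothetical placeholders, since the exercise text gives only the two areas.

```r
# Relative effect size: change from a to b, relative to a
# (assumed definition of "effect size").
effect_size <- function(a, b) (b - a) / a

# Lie factor = visual effect size / data effect size
lie_factor <- function(visual_a, visual_b, data_a, data_b) {
  effect_size(visual_a, visual_b) / effect_size(data_a, data_b)
}

# Visual effect from the two areas given in the exercise (cm^2)
visual_effect <- effect_size(2.22, 2.96)

# HYPOTHETICAL data values, for illustration only:
# replace 10 and 15 with the quantities the two areas actually encode.
lf <- lie_factor(2.22, 2.96, data_a = 10, data_b = 15)
print(c(visual_effect = visual_effect, lie_factor = lf))
```

A Lie factor close to 1 means the graphic represents the data faithfully; values far from 1 in either direction indicate distortion.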
\[S = I^p\]
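Reading the formula above as Stevens' power law (perceived sensation \(S\) as a power \(p\) of physical intensity \(I\)), the sketch below shows how the exponent changes what a viewer perceives when the encoded quantity doubles. The exponents are illustrative values of the kind commonly quoted for length and area; they are assumptions, not figures from these slides.

```r
# Perceived magnitude under the power law S = I^p
stevens <- function(intensity, p) intensity^p

# ILLUSTRATIVE exponents (assumed): length is perceived roughly
# linearly, area is typically underestimated.
exponents <- c(length = 1.0, area = 0.7)

# Perceived ratio when the physical intensity doubles
perceived <- stevens(2, exponents)
print(round(perceived, 2))
```

With an exponent below 1, a doubling of the physical quantity is perceived as less than a doubling, which is one reason why channels such as area are less accurate than position or length.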
\[\textrm{graph data density} = \frac{\textrm{number of entries in data matrix}}{\textrm{area of data display}} \]
\[\textrm{data-ink ratio} = \frac{\textrm{area of data-ink}}{\textrm{total area of ink}}\]
https://speakerdeck.com/cherdarchuk/remove-to-improve-the-data-ink-ratio
+geom_line()
aes(x=a,y=b,...)
ggplot(mpg)+
geom_point(aes(x=cty,y=hwy,color=manufacturer,shape=factor(cyl)))
ggplot(mpg,aes(x=cty,y=hwy,color=manufacturer,shape=factor(cyl)))+
geom_jitter()
+geom_line()
aes(x=a,y=b,...)
scale_fill_brewer(palette=3,type="qual")
scale_x_continuous(limits=c(0,45),breaks=seq(0,45,2))
ggplot(mpg,aes(x=cty,y=hwy,color=manufacturer,shape=factor(cyl)))+
  geom_jitter()+
  scale_x_continuous(limits=c(0,45),breaks=seq(0,45,2))

+geom_line()
aes(x=a,y=b,...)
scale_fill_brewer(palette=3,type="qual")
scale_x_continuous(limits=c(0,45),breaks=seq(0,45,2))
facet_grid(. ~ cyl)
+geom_line()
aes(x=a,y=b,...)
scale_fill_brewer(palette=3,type="qual")
scale_x_continuous(limits=c(0,45),breaks=seq(0,45,2))
stat_density2d()
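The building blocks listed above (geometry, aesthetics, scales, facets, statistics) can be combined in a single plot. The following is a minimal sketch on the `mpg` data; the axis labels, the palette choice, and the decision to drop the few 5-cylinder cars (too few points for a density estimate) are assumptions made for the example.

```r
library(ggplot2)

# Assumption: drop the handful of 5-cylinder cars so that each
# facet has enough points for a 2D density estimate.
mpg_sub <- subset(mpg, cyl != 5)

p <- ggplot(mpg_sub, aes(x = cty, y = hwy)) +
  geom_jitter(aes(color = factor(cyl))) +            # geometry + aesthetics
  stat_density2d(color = "grey40") +                 # statistic
  scale_x_continuous("Miles / Gallon (city)",
                     limits = c(0, 45),
                     breaks = seq(0, 45, 5)) +       # scale
  scale_color_brewer(name = "Cylinders",
                     palette = 3, type = "qual") +   # color scale
  facet_grid(. ~ cyl)                                # facets
print(p)
```

Each `+` adds one layer or component, so the plot can be built up and inspected step by step.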
Update the scale and labels
# download and reshape the data
url = "./data/sp_Lyon.json"
data=fromJSON(file=url)
extract = function(x){
  data.frame(id=x$'_id',
             time= x$download_date,
             nbbikes = x$available_bikes )
}
st_tempstats.df=do.call(rbind,lapply(data,extract))
# select 3 stations (sel holds the chosen station ids)
st_tempstats_sub.df = st_tempstats.df %>%
  filter(id %in% sel)
ggplot(data=st_tempstats_sub.df)+
  geom_line(aes(x=time,y=nbbikes,group=id,color=factor(id)),size=2)+
  facet_grid(id ~ .)

Update the scale and labels
Reproduce this graphic (Iris data)
Reproduce this graphic (mtcars data). Hint: modify the theme of the plot, see ?theme
Reproduce this graphic